Efficient Data Mining for Path Traversal Patterns
Authors
Abstract
In this paper, we explore a new data mining capability that involves mining path traversal patterns in a distributed information-providing environment where documents or objects are linked together to facilitate interactive access. Our solution procedure consists of two steps. First, we derive an algorithm to convert the original sequence of log data into a set of maximal forward references. By doing so, we can filter out the effect of some backward references, which are mainly made for ease of traveling, and concentrate on mining meaningful user access sequences. Second, we derive algorithms to determine the frequent traversal patterns (i.e., large reference sequences) from the maximal forward references obtained. Two algorithms are devised for determining large reference sequences: one is based on hashing and pruning techniques, and the other is further improved with the option of determining large reference sequences in batch so as to reduce the number of database scans required. The performance of these two methods is comparatively analyzed. It is shown that the option of selective scan is very advantageous and can lead to a prominent performance improvement. Sensitivity analysis on various parameters is conducted.
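The two steps described in the abstract can be illustrated with a minimal sketch. It assumes that a session is a list of page identifiers and that revisiting a page already on the current path counts as a backward reference. The function names, the sample click path, and the brute-force counting step are illustrative assumptions; in particular, the counting function is a naive stand-in and does not reproduce the hashing, pruning, or selective-scan techniques mentioned above.

```python
from collections import Counter


def maximal_forward_references(path):
    """Split one click path into maximal forward references.

    Assumption: a visit to a page already on the current forward path
    is a backward move to that page.
    """
    current = []        # current forward path
    extending = False   # True if the last move was a forward move
    result = []
    for page in path:
        if page in current:
            # Backward move: emit the path only if it was just extended.
            if extending:
                result.append(tuple(current))
            current = current[: current.index(page) + 1]
            extending = False
        else:
            current.append(page)
            extending = True
    if extending and current:
        result.append(tuple(current))
    return result


def large_reference_sequences(references, min_support):
    """Naive count of consecutive subsequences with support >= min_support."""
    counts = Counter()
    for ref in references:
        seen = set()
        for i in range(len(ref)):
            for j in range(i + 1, len(ref) + 1):
                seen.add(ref[i:j])
        counts.update(seen)  # each reference contributes at most once
    return {seq: n for seq, n in counts.items() if n >= min_support}


# Hypothetical click path, one character per page.
refs = maximal_forward_references(list("ABCDCBEGHGWAOUOV"))
frequent = large_reference_sequences(refs, min_support=3)
```

For the sample path, the first step yields five forward paths (ABCD, ABEGH, ABEGW, AOU, AOV), and with a minimum support of 3 only A, B, and AB remain as frequent sequences.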
Similar resources
Mining Top-K Path Traversal Patterns over Streaming Web Click-Sequences
Online, one-pass mining Web click streams poses some interesting computational issues, such as unbounded length of streaming data, possibly very fast arrival rate, and just one scan over previously arrived Web click-sequences. In this paper, we propose a new, single-pass algorithm, called DSM-TKP (Data Stream Mining for Top-K Path traversal patterns), for mining a set of top-k path traversal pa...
Mining Web navigation patterns with a path traversal graph
With the expansion of e-commerce and mobile-based commerce, the role of web user on World Wide Web has become pivotal enough to warrant studies to further understand the user’s intent, navigation patterns on websites and usage needs. Using web logs on the servers hosting websites, site owners and in turn companies, can extract information to better understand and predict user’s needs, tailoring...
DSM-PLW: Single-pass mining of path traversal patterns over streaming Web click-sequences
Mining Web click streams is an important data mining problem with broad applications. However, it is also a difficult problem since the streaming data possess some interesting characteristics, such as unknown or unbounded length, possibly a very fast arrival rate, inability to backtrack over previously arrived click-sequences, and a lack of system control over the order in which the data arrive...
Mining Objects Correlations to Improve Interactive Virtual Reality Latency
Object correlations are common semantic patterns in virtual reality systems. They can be exploited for improving the effectiveness of storage caching, prefetching, data layout, and minimization of query response times. Unfortunately, this information about object correlations is unavailable at the storage system level. Previous approaches for reducing I/O access time are seldom investigated. On ...
Review on Path Traversal for Web Navigation Mining
Web navigation pattern mining is a topic under Web usage mining that shows how a user moves from one page to another, i.e., it shows navigational behaviour. This kind of pattern mining is mostly used in e-commerce and mobile commerce. Analysing this data will help organizations realize the lifetime value of their clients and provide them with a more sophisticated structure of the web site...
Journal title:
- IEEE Trans. Knowl. Data Eng.
Volume 10, Issue
Pages -
Publication date 1998